KMID : 1132720200180020013
Genomics & Informatics
2020 Volume 18 No. 2 p.13
Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review
Nam Hee-Jo

Yamada Ryota
Park Hyun-Seok
Abstract
The prototype version of the full-text corpus of Genomics & Informatics has recently been archived in a GitHub repository. The full-text publications of volumes 10 through 17 are also directly downloadable from PubMed Central (PMC) as XML files. During the Biomedical Linked Annotation Hackathon 6 (BLAH6), we experimented with converting, annotating, and updating 301 PMC full-text articles of Genomics & Informatics using PubAnnotation, a system that provides a convenient way to add PMC publications based on PMCID. Thus, this review aims to provide a tutorial overview of practicing the iterative task of named entity recognition with the PubAnnotation/PubDictionaries/TextAE ecosystem. We also describe developing a conversion tool between the Genia tagger output and the JSON format of PubAnnotation during the hackathon.
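The conversion step mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' tool. It assumes the Genia tagger emits one token per line with five tab-separated columns (word, base form, part-of-speech tag, chunk tag, named-entity tag in B-/I-/O notation) and that the target is the PubAnnotation JSON layout with a "text" field and a "denotations" list of id/span/obj entries. The function name genia_to_pubannotation and the sample sentence are hypothetical.

```python
import json


def genia_to_pubannotation(text, tagged_lines):
    """Turn Genia-tagger-style rows into a PubAnnotation-style dict.

    Assumption: each non-empty row is 'word<TAB>base<TAB>POS<TAB>chunk<TAB>NE',
    and NE tags look like 'B-protein', 'I-protein', or 'O'.
    """
    denotations = []
    cursor = 0      # position in `text` up to which tokens have been aligned
    current = None  # entity that is currently being extended

    def close():
        # Store the entity being built, if any.
        nonlocal current
        if current is not None:
            denotations.append(current)
            current = None

    for line in tagged_lines:
        line = line.rstrip("\n")
        if not line.strip():
            close()  # a blank line marks a sentence boundary
            continue
        cols = line.split("\t")
        word, ne_tag = cols[0], cols[-1]
        begin = text.find(word, cursor)
        if begin < 0:
            # The token could not be aligned with the text (for example,
            # the tagger rewrote it); this sketch simply skips it.
            close()
            continue
        end = begin + len(word)
        cursor = end
        if ne_tag.startswith("B-"):
            close()
            current = {
                "id": f"T{len(denotations) + 1}",
                "span": {"begin": begin, "end": end},
                "obj": ne_tag[2:],
            }
        elif (ne_tag.startswith("I-") and current is not None
              and current["obj"] == ne_tag[2:]):
            current["span"]["end"] = end  # extend the open entity
        else:
            close()
    close()
    return {"text": text, "denotations": denotations}


# Hypothetical sample input, written by hand for illustration.
sample_text = "IL-2 gene expression requires NF-kappa B."
sample_rows = [
    "IL-2\tIL-2\tNN\tB-NP\tB-DNA",
    "gene\tgene\tNN\tI-NP\tI-DNA",
    "expression\texpression\tNN\tI-NP\tO",
    "requires\trequire\tVBZ\tB-VP\tO",
    "NF-kappa\tNF-kappa\tNN\tB-NP\tB-protein",
    "B\tB\tNN\tI-NP\tI-protein",
    ".\t.\t.\tO\tO",
]
doc = genia_to_pubannotation(sample_text, sample_rows)
print(json.dumps(doc, indent=2))
```

Character offsets are recovered by searching for each token from the last aligned position, which keeps the spans tied to the original text rather than to the tokenized output. A production converter would need a more careful alignment for tokens the tagger alters.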
KEYWORD
named entity recognition, natural language processing, text mining
Listed journal information
Korea Research Foundation (KCI), KoreaMed